With phoneme code you intercept the data halfway through the process of converting text into sound data. But you can also intercept the sound data at the end, just before it reaches the speaker, using the SP Voice sound function.
This feature depends on the speech synthesizer: so far it only works with the MacinTalk Pro voice synthesizer, and it is largely undocumented.
The MacinTalk Pro synthesizer returns 16-bit sound data. If you want to store it as 8-bit data, it is converted for you. For synthesized speech, 16-bit is somewhat overdone because you probably won't hear a difference.
But what's the difference between 8-bit and 16-bit sound data?
The quality of 16-bit sound is better because each sample can be taken, or calculated as the Speech Manager does, more accurately: a 16-bit sample can take 65,536 different values, an 8-bit sample only 256. However, 16-bit sound data takes twice as much memory and disk space as 8-bit data.
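To make the size difference concrete, here is a small sketch in Python (not part of Speech Pack) that estimates how many bytes of raw sample data a stretch of speech needs. The function name and the 22,050 Hz mono sample rate are assumptions chosen for illustration only; the actual rate depends on the synthesizer.

```python
def sound_data_size(seconds, bits, sample_rate=22050, channels=1):
    """Estimate the size in bytes of uncompressed sample data.

    sample_rate and channels are illustrative assumptions,
    not values taken from the Speech Pack documentation.
    """
    bytes_per_sample = bits // 8
    return int(seconds * sample_rate) * channels * bytes_per_sample


# Ten seconds of speech stored as 8-bit and as 16-bit data.
size_8 = sound_data_size(10, bits=8)
size_16 = sound_data_size(10, bits=16)
print(size_8, size_16)
```

Whatever sample rate you assume, the 16-bit figure comes out exactly twice the 8-bit one.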
Another disadvantage is that 16-bit data can only be played on machines which have Sound Manager 3.0 (or later) installed*, while it can be created on every machine that supports the MacinTalk Pro voices. If you try to play a 16-bit sound resource without the new Sound Manager, you won't hear anything.
Since the MacinTalk Pro voice synthesizer returns 16-bit data, Speech Pack converts it into 8-bit data when requested. This conversion is slow, so creating 16-bit data is a lot faster. (Even so, the initial conversion code took twice as much time as the current, optimized code.)
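The idea behind such a conversion can be sketched as follows. This is a minimal illustration in Python, not Speech Pack's actual code: it assumes the 16-bit samples are signed and big-endian and that the 8-bit result should be unsigned with 128 as silence, and the function name is made up for this example.

```python
import struct


def convert_16_to_8(data):
    """Reduce signed big-endian 16-bit samples to unsigned 8-bit samples.

    Assumed formats: 16-bit input is two's complement, 8-bit output
    is offset binary (128 = silence). A trailing odd byte is ignored.
    """
    count = len(data) // 2
    samples = struct.unpack(">%dh" % count, data[:count * 2])
    # Keep the high byte of each sample (range -128..127),
    # then shift it into the unsigned range 0..255.
    return bytes((s >> 8) + 128 for s in samples)


# Silence, the most negative sample, and the most positive sample.
raw = struct.pack(">3h", 0, -32768, 32767)
print(list(convert_16_to_8(raw)))
```

Every sample has to be read, reduced, and written again, which is why converting takes noticeably longer than storing the 16-bit data unchanged.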
If you want to determine whether the machine your application is running on supports 16-bit sound data, you can use the Gestalt Pro external I published. Its documentation shows you how to do this.
* Sound Manager 3.0 is distributed separately but is also included in Hardware System Update 2.0 and later.